Multilevel parallelism for the exploration of large-scale graphs / Bernaschi, Massimo; Bisson, Mauro; Mastrostefano, Enrico; Vella, Flavio. - In: IEEE TRANSACTIONS ON MULTI-SCALE COMPUTING SYSTEMS. - ISSN 2332-7766. - 4:(2018), pp. 204-216. [10.1109/TMSCS.2018.2797195]
Multilevel parallelism for the exploration of large-scale graphs
Bernaschi, Massimo; Bisson, Mauro; Mastrostefano, Enrico; Vella, Flavio
2018
Abstract
We present the most recent release of our parallel implementation of the breadth-first search (BFS) and betweenness centrality (BC) algorithms for the study of large-scale graphs. Although our reference platform is a high-end cluster of new-generation Nvidia GPUs and some of our optimisations are CUDA-specific, most of our ideas can be applied to other platforms offering multiple levels of parallelism. We exploit multilevel parallel processing through a hybrid programming paradigm that combines highly tuned CUDA kernels, for the computations performed by each node, with explicit data exchange through the Message Passing Interface (MPI), for the communications among nodes. The results of the numerical experiments show that the performance of our code is comparable to or better than that of other state-of-the-art solutions. For the BFS, for instance, we reach a peak performance of 200 GigaTEPS on a single GPU and 5.5 TeraTEPS on 1024 Pascal …
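As a reading aid for the abstract, the following is a minimal single-process sketch of a level-synchronous BFS together with a TEPS (traversed edges per second) estimate. It is not the authors' CUDA/MPI implementation. The function name `bfs_levels`, the toy graph, and the choice to count scanned adjacency entries as "traversed edges" are all assumptions made for illustration, and benchmarks may count edges differently.

```python
import time

def bfs_levels(adj, source):
    """Level-synchronous BFS: expand the whole frontier one level at a time.

    Returns the BFS level of every reached vertex and the number of
    adjacency entries scanned (used here as a simplified edge count).
    """
    level = {source: 0}
    frontier = [source]
    edges_scanned = 0
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # In a parallel code, this loop over the frontier is the part
        # that would be distributed across threads and nodes.
        for u in frontier:
            for v in adj[u]:
                edges_scanned += 1
                if v not in level:
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level, edges_scanned

# Hypothetical toy graph (undirected, stored as adjacency lists).
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}

start = time.perf_counter()
level, edges_scanned = bfs_levels(adj, 0)
elapsed = time.perf_counter() - start

# TEPS estimate: edges processed divided by wall-clock time.
teps = edges_scanned / elapsed if elapsed > 0 else float("inf")
print(level, edges_scanned)
```

On a graph this small the TEPS value is dominated by timer noise. The sketch only shows how the metric relates the work done by the traversal to its running time.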